Using Multiple-Sequence Alignment and Machine Learning Techniques to Improve Machine Translation

Authors

  • Kevin Gimpel
  • Aravind Joshi
Abstract

As Internet access spreads to all nations of the globe, it is becoming increasingly important to be able to access information in a language other than that in which it was written. A survey of the free machine translation (MT) systems available on the Internet yields unsatisfying results; MT systems simply have not been able to improve fast enough to satisfy the demand for reliable, accurate translation on the level that a competent human translator could produce. The present work endeavors to begin with what machine translation ends with, to work from the admittedly flawed output of these systems and, through simple techniques, to attempt to create a superior machine translation. We shall discuss two approaches for doing so: (1) making use of multiple machine translation systems and a biological sequencing algorithm known as multiple-sequence alignment, and (2) creating an n-gram replacer for domain-restricted input texts through the application of machine learning techniques.
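The abstract does not spell out how the outputs of several MT systems are combined, so the following is only a minimal sketch of the general idea behind approach (1), not the authors' method. It assumes whitespace tokenization and swaps in a much simpler technique than true multiple-sequence alignment: each output is aligned pairwise to the first one (used as a backbone) with Python's standard `difflib.SequenceMatcher`, and a majority vote is taken at each backbone position. The function name `consensus` and the example sentences are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def consensus(outputs):
    """Combine several MT outputs by voting per token position.

    Assumption: the first output serves as the backbone; insertions and
    deletions relative to it are ignored in this simplified sketch.
    """
    tokenized = [s.split() for s in outputs]
    backbone = tokenized[0]
    # One vote counter per backbone position; the backbone votes for itself.
    votes = [Counter([tok]) for tok in backbone]
    for other in tokenized[1:]:
        matcher = SequenceMatcher(a=backbone, b=other, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            # Count only matches and one-to-one substitutions of equal length.
            if tag in ("equal", "replace") and (i2 - i1) == (j2 - j1):
                for k in range(i2 - i1):
                    votes[i1 + k][other[j1 + k]] += 1
    # Pick the most frequent token at each backbone position.
    return " ".join(c.most_common(1)[0][0] for c in votes)


if __name__ == "__main__":
    systems = [
        "the cat sit on the mat",
        "the cat sat on the mat",
        "a cat sat on the mat",
    ]
    print(consensus(systems))
```

A full multiple-sequence alignment would align all outputs jointly and could also vote on inserted or deleted words; the backbone shortcut here keeps the sketch short at the cost of biasing the result toward the first system.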

Similar resources

Machine learning algorithms in air quality modeling

Modern studies in the field of environmental science and engineering show that deterministic models struggle to capture the relationship between the concentration of atmospheric pollutants and their emission sources. Recent advances in statistical modeling based on machine learning approaches have emerged as a solution to these issues. It is a fact that input variable type largely affec...

Image Classification via Sparse Representation and Subspace Alignment

Image representation is a crucial problem in image processing, where there exist many low-level representations of images, e.g., SIFT, HOG, and so on. But there is a missing link between low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation, and principal component analysis, are employed to d...

Image alignment via kernelized feature learning

Machine learning is an application of artificial intelligence that is able to automatically learn and improve from experience without being explicitly programmed. The primary assumption of most machine learning algorithms is that the training set (source domain) and the test set (target domain) follow the same probability distribution. However, in most of the real-world application...

Prenominal Modifier Ordering via Multiple Sequence Alignment

Producing a fluent ordering for a set of prenominal modifiers in a noun phrase (NP) is a problematic task for natural language generation and machine translation systems. We present a novel approach to this issue, adapting multiple sequence alignment techniques used in computational biology to the alignment of modifiers. We describe two training techniques to create such alignments based on raw...

Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches

A DNA sequence, containing all genetic traits, is not a functional entity. Instead, it is converted into protein sequences by the transcription and translation processes. This protein sequence later takes on a 3D structure, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can imagine is undertaken by proteins with specific functio...

Publication date: 2004